-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL Server: transform aggregate functions over subqueries #34262
Conversation
/// any release. You should only use it directly in your code with extreme caution and knowing that | ||
/// doing so can result in application failures when updating to a new Entity Framework Core release. | ||
/// </summary> | ||
protected override ShapedQueryExpression? TranslateAverage(ShapedQueryExpression source, LambdaExpression? selector, Type resultType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR basically reverts #32374, which added detection of parameterized Contains inside aggregate function invocations, to avoid translating that to an OPENJSON subquery (as usual) but to instead fall back to the traditional IN+constants translation.
This PR now adds general support for IN+subquery - OPENJSON or otherwise - by lifting it out to OUTER APPLY, so we can remove the special casing and use OPENJSON.
Note that this creates more complicated SQL (see below).
SELECT COUNT(CASE | ||
WHEN [c].[City] IN (N'London', N'Berlin') THEN 1 | ||
WHEN [s].[value] = CAST(1 AS bit) THEN 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the parameterized Contains change, as mentioned above. We now do IN+subquery with OPENJSON, just like everywhere else; the new postprocessing step then lifts that subquery up to an OUTER APPLY.
This makes the SQL more complex; I don't think there's a significant perf aspect here since I'm assuming no indexes would be used anyway for these cases. In any case, the OPENJSON vs. constants question is its own separate discussion, so I think it's the right thing to remove the special-casing here for SQL Server.
/cc @cincuranet
AssertSql( | ||
""" | ||
SELECT COALESCE(SUM(CASE | ||
WHEN [s].[value] = CAST(1 AS bit) THEN [s0].[OrderID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ranma42 you may be slightly interested in this (I know you have no context whatsoever)... This specific test shows us lifting an EXISTS subquery out when it's nested inside an aggregate function call (not supported by SQL Server), and integrating it as an OUTER APPLY, at which point it can be referenced from inside the aggregate function call.
Specifically with EXISTS, it's funny/painful to see how we need to translate the boolean (search condition) to bit for projection out of the OUTER APPLY, and then to translate it back to bool via CASE/WHEN in the projection... So much trouble and ceremony due to the lack of a proper bool type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would something like [s].[value] * [s0].[OrderID]
make sense for this?
EDIT: already handled by #34262 (comment)
FROM [Customers] AS [c] | ||
CROSS JOIN ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the CROSS JOIN instead of CROSS APPLY here, since the subquery is uncorrelated.
e7752d9
to
95146a8
Compare
test/EFCore.SqlServer.FunctionalTests/Query/NorthwindAggregateOperatorsQuerySqlServerTest.cs
Outdated
Show resolved
Hide resolved
SELECT 5 + ( | ||
SELECT MAX([o0].[ProductID]) | ||
FROM [Order Details] AS [o0] | ||
WHERE [o].[OrderID] = [o0].[OrderID]) AS [value] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to have
OUTER APPLY (
SELECT 5 + MAX([o0].[ProductID]) AS [value]
FROM [Order Details] AS [o0]
WHERE [o].[OrderID] = [o0].[OrderID]
) AS [s]
here?
I don't expect this to perform better; it is mainly for readability, so just ignore this if it is not trivial 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good suggestion but it seems orthogonal to this PR (i.e. the original SQL on the left had the same thing). Maybe open an issue to track that potential optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #34314 to track this
130, | ||
(await Assert.ThrowsAsync<SqlException>( | ||
async () => await base.Average_over_nested_subquery_is_client_eval(async))).Number); | ||
// Expected: 46.7126322751323 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use custom asserter to verify results with some error margin:
=> AssertAverage(
async,
ss => ss.Set<Customer>().OrderBy(c => c.CustomerID).Take(3),
selector: c => (decimal)c.Orders.Average(o => 5 + o.OrderDetails.Average(od => od.ProductID)),
asserter: (e, a) => Assert.Equal(e, a, precision: 3));
39c1584
to
2e1f0f8
Compare
Single-handed weekend project!
Closes #34256